AITopics

Country:

North America > United States > California (0.14)
North America > United States > Washington > King County > Seattle (0.04)
North America > Dominican Republic (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

Neural Information Processing SystemsDec-26-2025, 14:11:25 GMT

Multi-Head Adapter Routing for Cross-Task Generalization

Parameter-efficient fine-tuning (PEFT) for cross-task generalization consists in pre-training adapters on a multi-task training set before few-shot adaptation to test tasks. Polytropon [Ponti et al., 2023] ($\texttt{Poly}$) jointly learns an inventory of adapters and a *routing* function that selects a (variable-size) subset of adapters for each task during both pre-training and few-shot adaptation. In this paper, we investigate the role that adapter routing plays in its success and design new variants based on our findings.First, we build on the intuition that finer-grained routing provides more expressivity. Hence,we propose $\texttt{MHR}$ (Multi-Head Routing) which combines *subsets* of adapter parameters and outperforms $\texttt{Poly}$ under a comparable parameter budget; by only fine-tuning the routing function and not the adapters ($\texttt{MHR}$-$z$) we achieve competitive performance with extreme parameter efficiency. Second, we find that $\texttt{Poly}$/$\texttt{MHR}$ performance is a result of better multi-task optimization, rather than modular inductive biases that facilitate adapter recombination and local adaptation, as previously hypothesized.

adapter, multi-head adapter routing, texttt, (10 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Gong, Changqing, Qin, Huafeng, El-Yacoubi, Mounim A.

Understanding Cross Task Generalization in Handwriting-Based Alzheimer's Screening via Vision Language Adaptation

arXiv.org Artificial IntelligenceNov-11-2025

Alzheimer's disease is a prevalent neurodegenerative disorder for which early detection is critical. Handwriting-often disrupted in prodromal AD-provides a non-invasive and cost-effective window into subtle motor and cognitive decline. Existing handwriting-based AD studies, mostly relying on online trajectories and hand-crafted features, have not systematically examined how task type influences diagnostic performance and cross-task generalization. Meanwhile, large-scale vision language models have demonstrated remarkable zero or few-shot anomaly detection in natural images and strong adaptability across medical modalities such as chest X-ray and brain MRI. However, handwriting-based disease detection remains largely unexplored within this paradigm. To close this gap, we introduce a lightweight Cross-Layer Fusion Adapter framework that repurposes CLIP for handwriting-based AD screening. CLFA implants multi-level fusion adapters within the visual encoder to progressively align representations toward handwriting-specific medical cues, enabling prompt-free and efficient zero-shot inference. Using this framework, we systematically investigate cross-task generalization-training on a specific handwriting task and evaluating on unseen ones-to reveal which task types and writing patterns most effectively discriminate AD. Extensive analyses further highlight characteristic stroke patterns and task-level factors that contribute to early AD identification, offering both diagnostic insights and a benchmark for handwriting-based cognitive assessment.

data mining, large language model, machine learning, (18 more...)

2511.05841

Country: Asia (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Data Science > Data Mining (0.90)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
(2 more...)

arXiv.org Artificial IntelligenceOct-21-2025

Exploring the Limits of Vision-Language-Action Manipulations in Cross-task Generalization

Zhou, Jiaming, Ye, Ke, Liu, Jiayi, Ma, Teli, Wang, Zifan, Qiu, Ronghe, Lin, Kun-Yu, Zhao, Zhilin, Liang, Junwei

The generalization capabilities of vision-language-action (VLA) models to unseen tasks are crucial to achieving general-purpose robotic manipulation in open-world settings. However, the cross-task generalization capabilities of existing VLA models remain significantly underexplored. To address this gap, we introduce AGNOSTOS, a novel simulation benchmark designed to rigorously evaluate cross-task zero-shot generalization in manipulation. AGNOSTOS comprises 23 unseen manipulation tasks for testing, distinct from common training task distributions, and incorporates two levels of generalization difficulty to assess robustness. Our systematic evaluation reveals that current VLA models, despite being trained on diverse datasets, struggle to generalize effectively to these unseen tasks. To overcome this limitation, we propose Cross-Task In-Context Manipulation (X-ICM), a method that conditions large language models (LLMs) on in-context demonstrations from seen tasks to predict action sequences for unseen tasks. Additionally, we introduce a dynamics-guided sample selection strategy that identifies relevant demonstrations by capturing cross-task dynamics. On AGNOSTOS, X-ICM significantly improves cross-task zero-shot generalization performance over leading VLAs. We believe AGNOSTOS and X-ICM will serve as valuable tools for advancing general-purpose robotic manipulation.

demonstration, large language model, natural language, (17 more...)

2505.1566

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Neural Information Processing SystemsAug-16-2025, 18:41:53 GMT

8a0d3ae989a382ce6e50312bc35bf7e1-Paper-Conference.pdf

large language model, machine learning, natural language, (19 more...)

Country:

North America > United States > California (0.14)
North America > United States > Washington > King County > Seattle (0.04)
North America > Dominican Republic (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.46)
Government > Military (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

arXiv.org Artificial IntelligenceMay-23-2025

T2I-ConBench: Text-to-Image Benchmark for Continual Post-training

Huang, Zhehao, Liu, Yuhang, Lou, Yixin, He, Zhengbao, He, Mingzhen, Zhou, Wenxing, Li, Tao, Li, Kehan, Huang, Zeyi, Huang, Xiaolin

Continual post-training adapts a single text-to-image diffusion model to learn new tasks without incurring the cost of separate models, but naive post-training causes forgetting of pretrained knowledge and undermines zero-shot compositionality. We observe that the absence of a standardized evaluation protocol hampers related research for continual post-training. To address this, we introduce T2I-ConBench, a unified benchmark for continual post-training of text-to-image models. T2I-ConBench focuses on two practical scenarios, item customization and domain enhancement, and analyzes four dimensions: (1) retention of generality, (2) target-task performance, (3) catastrophic forgetting, and (4) cross-task generalization. It combines automated metrics, human-preference modeling, and vision-language QA for comprehensive assessment. We benchmark ten representative methods across three realistic task sequences and find that no approach excels on all fronts. Even joint "oracle" training does not succeed for every task, and cross-task generalization remains unsolved. We release all datasets, code, and evaluation tools to accelerate research in continual post-training for text-to-image models.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

2505.16875

Genre: Research Report (1.00)

Industry:

Media > Photography (0.46)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
(3 more...)

Neural Information Processing SystemsJan-19-2025, 19:33:18 GMT

Multi-Head Adapter Routing for Cross-Task Generalization

Parameter-efficient fine-tuning (PEFT) for cross-task generalization consists in pre-training adapters on a multi-task training set before few-shot adaptation to test tasks. Polytropon [Ponti et al., 2023] ( \texttt{Poly}) jointly learns an inventory of adapters and a *routing* function that selects a (variable-size) subset of adapters for each task during both pre-training and few-shot adaptation. In this paper, we investigate the role that adapter routing plays in its success and design new variants based on our findings.First, we build on the intuition that finer-grained routing provides more expressivity. Hence,we propose \texttt{MHR} (Multi-Head Routing) which combines *subsets* of adapter parameters and outperforms \texttt{Poly} under a comparable parameter budget; by only fine-tuning the routing function and not the adapters ( \texttt{MHR} - z) we achieve competitive performance with extreme parameter efficiency. Second, we find that \texttt{Poly} / \texttt{MHR} performance is a result of better multi-task optimization, rather than modular inductive biases that facilitate adapter recombination and local adaptation, as previously hypothesized.

adapter, multi-head adapter routing, texttt, (8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial IntelligenceJun-18-2024

From Instance Training to Instruction Learning: Task Adapters Generation from Instructions

Liao, Huanxuan, Xu, Yao, He, Shizhu, Zhang, Yuanzhe, Hao, Yanchao, Liu, Shengping, Liu, Kang, Zhao, Jun

Large language models (LLMs) have acquired the ability to solve general tasks by utilizing instruction finetuning (IFT). However, IFT still relies heavily on instance training of extensive task data, which greatly limits the adaptability of LLMs to real-world scenarios where labeled task instances are scarce and broader task generalization becomes paramount. Contrary to LLMs, humans acquire skills and complete tasks not merely through repeated practice but also by understanding and following instructional guidelines. This paper is dedicated to simulating human learning to address the shortcomings of instance training, focusing on instruction learning to enhance cross-task generalization. Within this context, we introduce Task Adapters Generation from Instructions (TAGI), which automatically constructs the task-specific model in a parameter generation manner based on the given task instructions without retraining for unseen tasks. Specifically, we utilize knowledge distillation to enhance the consistency between TAGI developed through Learning with Instruction and task-specific models developed through Training with Instance, by aligning the labels, output logits, and adapter parameters between them. TAGI is endowed with cross-task generalization capabilities through a two-stage training process that includes hypernetwork pretraining and finetuning. We evaluate TAGI on the Super-Natural Instructions and P3 datasets. The experimental results demonstrate that TAGI can match or even outperform traditional meta-trained models and other hypernetwork models, while significantly reducing computational requirements.

hypernetwork, instruction, tagi, (14 more...)

2406.12382

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > Dominican Republic (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial IntelligenceNov-19-2023

Learning to Initialize: Can Meta Learning Improve Cross-task Generalization in Prompt Tuning?

Qin, Chengwei, Li, Qian, Zhao, Ruochen, Joty, Shafiq

Prompt tuning (PT) which only tunes the embeddings of an additional sequence of tokens per task, keeping the pre-trained language model (PLM) frozen, has shown remarkable performance in few-shot learning. Despite this, PT has been shown to rely heavily on good initialization of the prompt embeddings. In this work, we study meta prompt tuning (MPT) to systematically explore how meta-learning can help improve (if it can) cross-task generalization in PT through learning to initialize the prompt embeddings from other relevant tasks. We empirically analyze a representative set of meta learning algorithms in a wide range of adaptation settings with different source/target task configurations on a large set of few-shot tasks. With extensive experiments and analysis, we demonstrate the effectiveness of MPT. We find the improvement to be significant particularly on classification tasks. For other kinds of tasks such as question answering, we observe that while MPT can outperform PT in most cases, it does not always outperform multi-task learning. We further provide an in-depth analysis from the perspective of task similarity.

computational linguistic, proceedings, target task, (13 more...)

2302.08143

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.15)
North America > United States > Washington > King County > Seattle (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
(24 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (1.00)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

arXiv.org Artificial IntelligenceNov-1-2023

Active Instruction Tuning: Improving Cross-Task Generalization by Training on Prompt Sensitive Tasks

Kung, Po-Nien, Yin, Fan, Wu, Di, Chang, Kai-Wei, Peng, Nanyun

Instruction tuning (IT) achieves impressive zero-shot generalization results by training large language models (LLMs) on a massive amount of diverse tasks with instructions. However, how to select new tasks to improve the performance and generalizability of IT models remains an open question. Training on all existing tasks is impractical due to prohibiting computation requirements, and randomly selecting tasks can lead to suboptimal performance. In this work, we propose active instruction tuning based on prompt uncertainty, a novel framework to identify informative tasks, and then actively tune the models on the selected tasks. We represent the informativeness of new tasks with the disagreement of the current model outputs over perturbed prompts. Our experiments on NIV2 and Self-Instruct datasets demonstrate that our method consistently outperforms other baseline strategies for task selection, achieving better out-of-distribution generalization with fewer training tasks. Additionally, we introduce a task map that categorizes and diagnoses tasks based on prompt uncertainty and prediction probability. We discover that training on ambiguous (prompt-uncertain) tasks improves generalization while training on difficult (prompt-certain and low-probability) tasks offers no benefit, underscoring the importance of task selection for instruction tuning.

active instruction tuning, cross-task generalization, prompt sensitive task

2311.00288

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.87)